AITopics

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > California > Alameda County > Berkeley (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.69)
Health & Medicine > Health Care Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.64)

Neural Information Processing SystemsFeb-15-2026, 05:54:12 GMT

Normalization Helps Training of Quantized LSTM

Lu Hou, Jinhua Zhu, James Kwok, Fei Gao, Tao Qin, Tie-Yan Liu

However, existing quantization methods usually have inferior performancewhenusedonLSTMs.

artificial intelligence, batch, machine learning, (20 more...)

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > China > Beijing > Beijing (0.04)
Asia > China > Anhui Province > Hefei (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.63)

Shailee Jain, Alexander Huth

Incorporating Context into Language Encoding Models for fMRI

Neural Information Processing SystemsNov-20-2025, 21:13:34 GMT

Language encoding models help explain language processing in the human brain by learning functions that predict brain responses from the language stimuli that elicited them.

artificial intelligence, machine learning, natural language, (20 more...)

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > California > Alameda County > Berkeley (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.69)
Health & Medicine > Health Care Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.64)

Baumgartner, Caroline, Spens, Eleanor, Burgess, Neil, Manescu, Petru

Cognitive Maps in Language Models: A Mechanistic Analysis of Spatial Planning

arXiv.org Artificial IntelligenceNov-18-2025

How do large language models solve spatial navigation tasks? We investigate this by training GPT-2 models on three spatial learning paradigms in grid environments: passive exploration (Foraging Model- predicting steps in random walks), goal-directed planning (generating optimal shortest paths) on structured Hamiltonian paths (SP-Hamiltonian), and a hybrid model fine-tuned with exploratory data (SP-Random Walk). Using behavioural, representational and mechanistic analyses, we uncover two fundamentally different learned algorithms. The Foraging model develops a robust, map-like representation of space, akin to a 'cognitive map'. Causal interventions reveal that it learns to consolidate spatial information into a self-sufficient coordinate system, evidenced by a sharp phase transition where its reliance on historical direction tokens vanishes by the middle layers of the network. The model also adopts an adaptive, hierarchical reasoning system, switching between a low-level heuristic for short contexts and map-based inference for longer ones. In contrast, the goal-directed models learn a path-dependent algorithm, remaining reliant on explicit directional inputs throughout all layers. The hybrid model, despite demonstrating improved generalisation over its parent, retains the same path-dependent strategy. These findings suggest that the nature of spatial intelligence in transformers may lie on a spectrum, ranging from generalisable world models shaped by exploratory data to heuristics optimised for goal-directed tasks. We provide a mechanistic account of this generalisation-optimisation trade-off and highlight how the choice of training regime influences the strategies that emerge.

large language model, machine learning, natural language, (22 more...)

2511.13371

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.69)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Maltoni, Davide, Ferrara, Matteo

Toward Mechanistic Explanation of Deductive Reasoning in Language Models

arXiv.org Artificial IntelligenceOct-13-2025

Recent large language models have demonstrated relevant capabilities in solving problems that require logical reasoning; however, the corresponding internal mechanisms remain largely unexplored. In this paper, we show that a small language model can solve a deductive reasoning task by learning the underlying rules (rather than operating as a statistical learner). A low-level explanation of its internal representations and computational circuits is then provided. Our findings reveal that induction heads play a central role in the implementation of the rule completion and rule chaining steps involved in the logical inference required by the task. Introduction Recent Large Language Models (LLMs) have demonstrated remarkable capabilities in reasoning and problem-solving (Huang and Chang, 2023). Many approaches have focused on enhancing logical reasoning in LLMs, with a growing body of work introducing formal and symbolic logic-based benchmarks (Liu et al., 2025). While much of the literature emphasizes solving reasoning benchmarks, comparatively less attention has been devoted to understanding and explaining the underlying low-level computational mechanisms. Y et, interpretability is crucial for designing more robust and targeted models, that are less prone to errors.

large language model, natural language, residual stream, (16 more...)

2510.0934

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Neural Information Processing SystemsOct-10-2025, 04:29:30 GMT

630d293833e09e1ecd892a898a20b074-Paper-Conference.pdf

attention head, matrix, matrix completion, (15 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Michigan (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(5 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Fay, Aideen, García-Redondo, Inés, Wang, Qiquan, Dubossarsky, Haim, Monod, Anthea

The Shape of Adversarial Influence: Characterizing LLM Latent Spaces with Persistent Homology

arXiv.org Artificial IntelligenceOct-10-2025

Existing interpretability methods for Large Language Models (LLMs) often fall short by focusing on linear directions or isolated features, overlooking the high-dimensional, nonlinear, and relational geometry within model representations. This study focuses on how adversarial inputs systematically affect the internal representation spaces of LLMs, a topic which remains poorly understood. We propose persistent homology (PH), a tool from topological data analysis, as a principled framework to characterize the multi-scale dynamics within LLM activations. Using PH, we systematically analyze six state-of-the-art models under two distinct adversarial conditions, indirect prompt injection and backdoor fine-tuning, and identify a consistent topological signature of adversarial influence. Across architectures and model sizes, adversarial inputs induce ``topological compression'', where the latent space becomes structurally simpler, collapsing from varied, compact, small-scale features into fewer, dominant, and more dispersed large-scale ones. This topological signature is statistically robust across layers, highly discriminative, and provides interpretable insights into how adversarial effects emerge and propagate. By quantifying the shape of activations and neuronal information flow, our architecture-agnostic framework reveals fundamental invariants of representational change, offering a complementary perspective to existing interpretability methods.

large language model, machine learning, natural language, (20 more...)

2505.20435

Country:

North America > United States > Minnesota (0.28)
Europe > United Kingdom > England (0.28)
Asia (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.94)

Industry: Information Technology (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

arXiv.org Artificial IntelligenceOct-10-2025

Feature Identification via the Empirical NTK

Lin, Jennifer

We provide evidence that eigenanalysis of the empirical neural tangent kernel (eNTK) can surface the features used by trained neural networks. Across two standard toy models for mechanistic interpretability, Toy Models of Superposition (TMS) and a 1-layer MLP trained on modular addition, we find that the eNTK exhibits sharp spectral cliffs whose top eigenspaces align with ground-truth features. In TMS, the eNTK recovers the ground-truth features in both the sparse (high superposition) and dense regimes. In modular arithmetic, the eNTK can be used to recover Fourier feature families. Moreover, we provide evidence that a layerwise eNTK localizes features to specific layers and that the evolution of the eNTK spectrum can be used to diagnose the grokking phase transition. These results suggest that eNTK analysis may provide a practical handle for feature discovery and for detecting phase changes in small models.

artificial intelligence, eigenvector, machine learning, (16 more...)

2510.00468

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)